1 Introduction



During the last several decades it has been demonstrated that so-called heavy-tailed distributions play an important role in many fields of applied probability. One of the main problems connected with heavy-tailed distributions is the estimation of the tail index, a parameter that characterizes the heaviness of the tail of a distribution. The problem can be formulated as follows. Consider a sample $X_1, \dots, X_N$ of size $N$ taken from a heavy-tailed distribution function (d.f.) $F$; that is, we assume that $X_1, \dots, X_N$ are independent identically distributed (i.i.d.) random variables with a d.f. $F$ satisfying the following relation for large $x$:

$$1 - F(x) = x^{-\alpha} L(x). \qquad (1)$$

Here $\alpha > 0$, $L(x) > 0$ for all $x > 0$, and $L$ is a function slowly varying at infinity:

$$\lim_{x\to\infty} \frac{L(tx)}{L(x)} = 1 \quad \text{for every } t > 0.$$

If only condition (1) is available, without any additional information about the function $L$, it is difficult to establish good properties, such as asymptotic normality, of an estimator of the parameter $\alpha$. Therefore most papers dealing with tail index estimation use the so-called second-order condition of regular variation. In recent years even a third-order condition on $F$ has been introduced (see, for example, Fraga Alves et al. (2006)). In this paper, as in Paulauskas (2003), we shall use the second-order condition in the form of relation (3) below.

Let $X_{N,1} \le X_{N,2} \le \dots \le X_{N,N}$ denote the order statistics of $X_1, \dots, X_N$. Most tail index estimators are based on order statistics and estimate the parameter $\gamma = 1/\alpha$. One of the most popular estimators of $\gamma$ was proposed by Hill (1975):

$$\hat\gamma^{(1)}_{N,k} = \frac{1}{k}\sum_{i=0}^{k-1} \log X_{N,N-i} - \log X_{N,N-k},$$

where $k$ is some number satisfying $1 \le k < N$. We also list some other estimators based on order statistics:

$$\hat\gamma^{(2)}_{N,k} = (\log 2)^{-1}\log\frac{X_{N,N-[k/4]} - X_{N,N-[k/2]}}{X_{N,N-[k/2]} - X_{N,N-k}},$$

$$\hat\gamma^{(3)}_{N,k} = \hat\gamma^{(1)}_{N,k} + 1 - \frac{1}{2}\Big(1 - \big(\hat\gamma^{(1)}_{N,k}\big)^2 / M_N\Big)^{-1},$$

$$\hat\gamma^{(4)}_{N,k} = \frac{M_N}{2\hat\gamma^{(1)}_{N,k}},$$

where

$$M_N = \frac{1}{k}\sum_{i=0}^{k-1}\big(\log X_{N,N-i} - \log X_{N,N-k}\big)^2.$$

The estimator $\hat\gamma^{(2)}_{N,k}$ was proposed by Pickands (1975), $\hat\gamma^{(3)}_{N,k}$ by Dekkers et al. (1989), and $\hat\gamma^{(4)}_{N,k}$ by C. G. de Vries (see, e.g., de Haan and Peng (1998)). Many papers are devoted to modifications of the estimators $\hat\gamma^{(i)}_{N,k}$, $i = 1,2,3,4$. Let us mention several of them (in chronological order): Weissman (1978), Smith (1987), Resnick and Starica (1997), Geluk and Peng (2000), Fraga Alves (2001), Gomes and Martins (2002), Fraga Alves et al. (2003), Li et al. (2008), Gomes et al. (2008).

All estimators $\hat\gamma^{(i)}_{N,k}$, $i = 1,2,3,4$, contain one additional parameter $k$, which has a clear intuitive meaning: how many of the largest order statistics must be taken in order for the estimator to have good properties (consistency, asymptotic normality, etc.). We present these four estimators because in de Haan and Peng (1998) they are compared and it is shown that none of them dominates the others. It turns out that for different values of the parameters $\gamma$ and $\rho$ (the parameters characterizing the so-called second-order asymptotic behavior of $F$, which will be introduced below) different estimators have the smallest asymptotic mean squared error.
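As an illustration of how the parameter $k$ enters, the Hill estimator can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the function names `hill_estimator` and `pareto_sample` are hypothetical, the data are assumed strictly positive (so that logarithms exist), and the simulated sample is an exact Pareto sample rather than a general heavy-tailed one.

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimator of gamma = 1/alpha based on the k largest observations."""
    xs = sorted(sample)                 # xs[j-1] plays the role of X_{N,j}
    n_obs = len(xs)
    if not 1 <= k < n_obs:
        raise ValueError("k must satisfy 1 <= k < N")
    log_threshold = math.log(xs[n_obs - k - 1])   # log X_{N,N-k}
    top = xs[n_obs - k:]                          # X_{N,N-k+1}, ..., X_{N,N}
    return sum(math.log(x) for x in top) / k - log_threshold

def pareto_sample(alpha, size, rng):
    """Standard Pareto sample, P(X > x) = x**(-alpha) for x >= 1 (inverse transform)."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(size)]

if __name__ == "__main__":
    rng = random.Random(0)
    data = pareto_sample(alpha=2.0, size=20000, rng=rng)
    # For exact Pareto data the estimate should be close to gamma = 1/alpha = 0.5.
    print(hill_estimator(data, k=2000))
```

For data that only satisfy (1), the choice of `k` trades bias against variance, which is exactly the issue the second-order condition is meant to quantify.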

In Davydov et al. (2000) (see also Davydov and Paulauskas (1999)) a new estimator was proposed, based on a different idea that arose when considering rather abstract objects, namely random compact convex stable sets. Originally, in Davydov et al. (2000), the estimator was constructed to estimate the index of a multivariate stable distribution (there was even the restriction $0 < \alpha < 1$ for this index), and the main tool in the proof was the relation between the exponential distribution and order statistics. In Paulauskas (2003) it was noted that the same construction can be based on a different idea and that this idea can be employed in the context of general tail index estimation. We shall return to this point after introducing this new estimator. The construction of the estimator is as follows.

We divide the sample into $n$ groups $V_1, \dots, V_n$, each group containing $m$ random variables; that is, we assume that $N = n \cdot m$ and $V_i = \{X_{(i-1)m+1}, \dots, X_{im}\}$. (In practice, $m$ is chosen first and then $n = [N/m]$ is taken, where $[x]$ stands for the integer part of a number $x > 0$.) Let

$$M^{(1)}_{ni} = \max\{X_j : X_j \in V_i\}$$

and let $M^{(2)}_{ni}$ denote the second largest element in the same group $V_i$. Let us denote

$$\kappa_{ni} = \frac{M^{(2)}_{ni}}{M^{(1)}_{ni}}, \qquad S_n = \sum_{i=1}^{n} \kappa_{ni}, \qquad Z_n = n^{-1} S_n. \qquad (2)$$
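The construction in (2) translates directly into code. The following is a minimal sketch, not the authors' implementation: the names `dpr_estimator` and `alpha_from_p` are hypothetical, leftover observations beyond $n = [N/m]$ full groups are simply dropped, and the inversion assumes (as discussed later in this section for Pareto data) that $Z_n$ estimates $p = \alpha/(1+\alpha)$.

```python
import random

def dpr_estimator(sample, m):
    """Z_n from (2): average of (second largest / largest) over consecutive
    groups of length m; observations beyond n = [N/m] full groups are dropped."""
    if m < 2:
        raise ValueError("each group needs at least two observations")
    n = len(sample) // m
    if n == 0:
        raise ValueError("sample is shorter than one group")
    total = 0.0
    for i in range(n):
        group = sorted(sample[i * m:(i + 1) * m])
        total += group[-2] / group[-1]      # kappa_{ni} = M^{(2)} / M^{(1)}
    return total / n

def alpha_from_p(p):
    """Invert p = alpha / (1 + alpha); an assumption valid for 0 < p < 1."""
    return p / (1.0 - p)

if __name__ == "__main__":
    rng = random.Random(0)
    # Standard Pareto sample with alpha = 2 (inverse transform sampling).
    data = [(1.0 - rng.random()) ** (-0.5) for _ in range(20000)]
    z = dpr_estimator(data, m=2)
    print(z, alpha_from_p(z))
```

Only the two largest values of each group are used, so the statistic can be updated group by group as data arrive.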

From now on, instead of (1) we require a stronger condition. Let us assume that we have a sample $X_1, \dots, X_N$ from a distribution $F$ which satisfies the second-order asymptotic relation (as $x \to \infty$)

$$1 - F(x) = C_1 x^{-\alpha} + C_2 x^{-\beta} + o(x^{-\beta}), \qquad (3)$$

with some parameters $0 < \alpha < \beta \le \infty$. It seems that Hall (see Hall (1982)) was the first to consider this condition in the context of tail index estimation and, under this condition, he proved the asymptotic normality of the Hill estimator. Note that the case $\beta = \infty$ corresponds to the Pareto distribution, and $\beta = 2\alpha$ to a stable distribution with exponent $0 < \alpha < 2$. In de Haan and Peng (1998) (and in many other papers dealing with tail index estimation as well) a more general second-order asymptotic relation is used, in a different form with parameters $\gamma = 1/\alpha$ and $\rho = \alpha - \beta$, and in the more general context of the extreme-value index, where the parameter $\gamma$ can take negative values, too. Namely, let $U$ denote the right-continuous inverse of the function $1/(1 - F)$. Suppose that there exists a function $A(t)$ which ultimately has constant sign and tends to zero as $t \to \infty$; then the relation

$$\lim_{t\to\infty} \frac{\frac{U(tx)}{U(t)} - x^{\gamma}}{A(t)} = x^{\gamma}\,\frac{x^{\rho} - 1}{\rho}, \quad \rho \le 0, \qquad (4)$$

serves as the definition of a second-order regularly varying tail $1 - F$.
As was mentioned above, the estimator $Z_n$ from (2) (in the different context of a sample from a multivariate stable distribution) was based on the following relation (see LePage et al. (1981)):

$$\kappa_{ni} \xrightarrow{D} \Gamma_1^{1/\alpha}\,\Gamma_2^{-1/\alpha},$$

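where $\Gamma_i = \sum_{j=1}^{i} \lambda_j$, the $\lambda_j$, $j \ge 1$, are i.i.d. standard exponential random variables, and $\xrightarrow{D}$ denotes convergence in distribution, and on the fact that

$$E\left(\frac{\lambda_1}{\lambda_1 + \lambda_2}\right)^{1/\alpha} = \frac{\alpha}{1 + \alpha}.$$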

Here it is perhaps worth mentioning that most tail index estimation procedures, starting from the papers of Hill (1975) and Pickands (1975), are based on the relation between order statistics and exponential distributions (the Rényi representation theorem): if $X_1, \dots, X_n$ is a sample from a continuous strictly increasing d.f. $F$, $F(0) = 0$, and $X^{(1)} \ge \dots \ge X^{(n)}$ are the order statistics, then

$$X^{(i)} = F^{-1}\Big[\exp\Big\{-\sum_{j=1}^{i} \lambda_j (n - j + 1)^{-1}\Big\}\Big], \quad i = 1, 2, \dots, n,$$

and

$$\lambda_j = (n - j + 1)\big(\ln F(X^{(j-1)}) - \ln F(X^{(j)})\big), \quad j = 1, 2, \dots, n$$

(with the convention $\ln F(X^{(0)}) = 0$).
In Paulauskas (2003) it was noted that the estimator from (2) can be based on a different idea. If we take two independent random variables $X$ and $Y$ with the same Pareto distribution

$$F(x) = 1 - C_1 x^{-\alpha}, \quad x \ge C_1^{1/\alpha},$$

and denote

$$W = \frac{\min(X, Y)}{\max(X, Y)}, \qquad (5)$$

then it is not difficult to verify that, denoting $p = \alpha/(1 + \alpha)$, we have $EW = p$ (since $W$ is invariant under scale transformation, we can take $C_1 = 1$, and in the sequel we shall refer to this distribution as the standard Pareto distribution).
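One way to verify the identity $EW = p$ is sketched below for the standard Pareto case $C_1 = 1$; the auxiliary variable $T$ is introduced here only for illustration. Since $P(\log X > u) = e^{-\alpha u}$, the variables $\log X$ and $\log Y$ are i.i.d. exponential with rate $\alpha$, and by the memoryless property the absolute difference of two such variables is again exponential with rate $\alpha$:

$$T := |\log X - \log Y| \sim \mathrm{Exp}(\alpha), \qquad W = e^{-T},$$

$$EW = \int_0^\infty e^{-t}\,\alpha e^{-\alpha t}\,dt = \frac{\alpha}{1 + \alpha} = p.$$

The same computation gives $EW^r = \alpha/(r + \alpha)$ for $r > -\alpha$ and $E(-\log W) = ET = 1/\alpha$.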

Therefore, in the case of the Pareto distribution, the quantity $Z_n$, as an estimator of the parameter $p$ (we shall denote it by $\hat p$ and, as in Qi (2010), call it the DPR estimator), is nothing but the sample mean of a bounded random variable; moreover, in this case the best choice is $m = 2$. If the underlying distribution $F$ is not Pareto but satisfies (3), then it is natural to expect that for large $m$ the expectation $E\hat p = E\kappa_{n1}$ will be close to $p$. In Paulauskas (2003) the following estimate, which was the main ingredient in the proof of the asymptotic normality of the estimator $\hat p$, was given (see the Lemma in Paulauskas (2003)):

$$|\gamma_m| \le C_0\, m^{-\zeta}, \qquad (6)$$

where $\gamma_m = E\hat p - p$, $\zeta = (\beta - \alpha)/\alpha$, and $C_0$ is a constant depending on $C_1$, $C_2$, $\alpha$, and $\beta$. The DPR estimator is a normalized sum of i.i.d. random variables, therefore $E\hat p = E\kappa_{n1}$ and $\gamma_m$ gives the bias of our estimator. Based on (6), the following result was proved in Paulauskas (2003).

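Theorem 1. Let us suppose that $F$ satisfies (3) with $0 < \alpha < \beta < \infty$. If we choose

$$n = \varepsilon_N N^{2\zeta/(1+2\zeta)}, \qquad m = \varepsilon_N^{-1} N^{1/(1+2\zeta)},$$

where $\varepsilon_N \to 0$ as $N \to \infty$, then

$$\sqrt{n}\,(\hat p - p) \xrightarrow[N\to\infty]{D} N(0, \sigma^2), \qquad (7)$$

where $\sigma^2 = \lim_{m\to\infty} \sigma_m^2 = \alpha\big((\alpha + 1)^2(\alpha + 2)\big)^{-1}$.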

Now we can give the exact asymptotic behavior of $\gamma_m$; this allows us to choose $m$ in an optimal way and to compare the DPR estimator with the four estimators listed above, in the same manner as it was done in de Haan and Peng (1998) (hence the words "once more" in the title of the paper). Our main result can be formulated as follows. (We write $a_n \sim b_n$ if $\lim_{n\to\infty} a_n b_n^{-1} = 1$.)

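Theorem 2. Let us suppose that $F$ satisfies (3) with $0 < \alpha < \beta < \infty$ and $C_1 > 0$. Then we have

$$\gamma_m \sim \chi\, m^{-\zeta}, \quad (m \to \infty), \qquad (8)$$

where

$$\chi = \chi(C_1, C_2, \alpha, \beta) = \frac{C_2\,\beta\,\zeta\,\Gamma(\zeta + 1)}{C_1^{\zeta+1}(\alpha + 1)(\beta + 1)}.$$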
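For sufficiently large $N$ (ensuring that $m_{opt} \ge 2$), taking

$$m = m_{opt} = \left(\frac{2\zeta\chi^2 N}{\sigma^2}\right)^{1/(1+2\zeta)}, \qquad (9)$$

we get that the MSE (mean squared error) is minimal:

$$E(\hat p - p)^2 \sim \frac{1 + 2\zeta}{2\zeta}\,\sigma^2 \left(\frac{2\zeta\chi^2}{\sigma^2}\right)^{1/(1+2\zeta)} N^{-2\zeta/(1+2\zeta)}. \qquad (10)$$

Under this choice of $m$ we have asymptotic normality

$$\sqrt{n}\,(\hat p - p) \xrightarrow[N\to\infty]{D} N(\mu, \sigma^2), \qquad (11)$$

where $\sigma^2$ is the same as in Theorem 1 and $\mu = \sigma(2\zeta)^{-1/2}\,\mathrm{sgn}(\chi)$.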

Remark 3. The estimator $\hat p$ (like all other tail index estimators introduced above) is invariant with respect to scale transformation, while condition (3) is not: if a random variable $X_1$ satisfies this condition, then the random variable $AX_1$, where $A > 0$, with distribution function $F_A$ satisfies the relation

$$1 - F_A(x) = C_1 A^{\alpha} x^{-\alpha} + C_2 A^{\beta} x^{-\beta} + o(x^{-\beta}),$$

therefore the constants in the second-order relation are not invariant. But it is easy to see that the ratio $C_1^{\beta}/C_2^{\alpha}$ is invariant, and all quantities in relations (8), (9), and (10) depend on the constants only through this ratio.

As was mentioned, this result allows us to compare the estimator $\hat p$ with the four estimators mentioned above, and this is done in Section 3. Although, according to the chosen criterion, the Hill estimator $\hat\gamma^{(1)}_{N,k}$ and the estimator $\hat\gamma^{(4)}_{N,k}$ asymptotically perform better than the estimator $\hat p$, the relation between the other two estimators and $\hat p$ is the same as in de Haan and Peng (1998): for some values of the parameters $\alpha$, $\beta$ the estimator $\hat p$ performs better than $\hat\gamma^{(2)}_{N,k}$ and $\hat\gamma^{(3)}_{N,k}$. But here it is worth mentioning the simple structure of the DPR estimator and the fact that it is well suited for recursive calculations (for example, when we have so-called tick-by-tick financial data and need tail index estimation in real time); see the monograph Markovich (2007). There are situations (such examples are mentioned in Qi (2010)) when data can be divided naturally into blocks but only a few of the largest observations within blocks are available. In such situations the estimator $\hat p$ can be applied, while all the other mentioned estimators are not applicable. Also, the estimator $\hat p$ is well adapted for detecting a change in the tail index; see Gadeikis and Paulauskas (2005), where the estimator $\hat p$ was used to analyze the financial crisis in Asian markets in 1997-98 and the results were compared with an analogous analysis using the Hill estimator in Quintos et al. (2001). Finally, the main reason why we think that the DPR estimator deserves the attention of statisticians is the possibility of several promising modifications of the estimator. One such modification is to introduce an additional parameter $r > 0$ and to consider the estimator

$$\hat p_r = \frac{1}{n}\sum_{i=1}^{n} \kappa_{ni}^{\,r}.$$

Again, using the standard Pareto distribution, it is easy to calculate that $\hat p_r$ estimates the quantity $\alpha(r + \alpha)^{-1}$. As a matter of fact, when preparing the paper Paulauskas (2003) the first named author had considered the estimator $\hat p_2$ (which to some extent resembles the quantity $M_N$; see the definitions of the estimators $\hat\gamma^{(3)}_{N,k}$ and $\hat\gamma^{(4)}_{N,k}$), but realized that there is no gain in replacing the first moment by the second one. Now it turns out that it is worth taking $0 < r < 1$, and we are able to prove asymptotic normality (under appropriate assumptions on $m$) for a fixed $r$. Also we can show that of two estimators with fixed parameters $0 < r' < r'' < 1$, the estimator $\hat p_{r'}$ has the smaller asymptotic MSE. Unfortunately, at present we do not know how to choose $r$ optimally; in general it may depend on $\alpha$, $\beta$, and even $N$.

Let us consider a general construction of modifications of the DPR estimator. Take some function $f : [0, 1] \to [0, \infty]$ such that $Ef(W)$ exists, where $W$ is from (5); then this expectation will be some function of $\alpha$ (depending, of course, on the function $f$). Let us denote this function by $h_f(\alpha)$, that is, $h_f(\alpha) = Ef(W)$. If $h_f$ is a one-to-one map from $[a, b]$ to $[c, d]$, with $[a, b]$ and $[c, d]$ being subsets of $[0, \infty]$, then estimating the quantity $h_f(\alpha)$ and taking the inverse function we get an estimator for $\alpha$ (with the restriction $a \le \alpha \le b$ if $0 < a < b < \infty$). Therefore it is natural to consider statistics of the form

$$\frac{1}{n}\sum_{i=1}^{n} f(\kappa_{ni}), \qquad (12)$$

obtaining a large class of modifications of the estimator $\hat p$ developed in Paulauskas (2003). The estimator $\hat p$ is obtained by taking $f_1(x) = x$; then $h_{f_1}(\alpha) = \alpha/(1 + \alpha)$. The above-mentioned modification $\hat p_r$ is obtained by taking $f_r(x) = x^r$, $r > 0$, and $h_{f_r}(\alpha) = h_r(\alpha) = \alpha/(r + \alpha)$. Estimators of the type (12) we shall call generalized DPR estimators, in short GDPR.
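The general recipe (average $f$ over the block ratios, then invert $h_f$) can be sketched as follows. This is an illustrative sketch only: all function names are hypothetical, the inverse maps are written out by hand for the two families discussed in the text, and no claim is made about which choice of $f$ is preferable.

```python
import math
import random

def block_ratios(sample, m):
    """kappa_{ni}: second largest over largest value in each block of length m."""
    n = len(sample) // m
    ratios = []
    for i in range(n):
        group = sorted(sample[i * m:(i + 1) * m])
        ratios.append(group[-2] / group[-1])
    return ratios

def gdpr_alpha(sample, m, f, h_inverse):
    """Average f(kappa_{ni}) as in (12), then map back to alpha via the inverse of h_f."""
    ratios = block_ratios(sample, m)
    mean_f = sum(f(k) for k in ratios) / len(ratios)
    return h_inverse(mean_f)

def power_family(r):
    """f_r(x) = x**r with h_r(alpha) = alpha/(r + alpha), hence alpha = r*h/(1 - h)."""
    return (lambda x: x ** r), (lambda h: r * h / (1.0 - h))

def log_family():
    """f(x) = -log x with h(alpha) = 1/alpha, hence alpha = 1/h."""
    return (lambda x: -math.log(x)), (lambda h: 1.0 / h)

def pareto_sample(alpha, size, rng):
    """Standard Pareto sample via inverse transform (used only for the demo)."""
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(size)]

if __name__ == "__main__":
    rng = random.Random(0)
    data = pareto_sample(2.0, 40000, rng)
    for name, (f, h_inv) in {"r=0.5": power_family(0.5), "log": log_family()}.items():
        print(name, gdpr_alpha(data, 2, f, h_inv))
```

Passing the pair `(f, h_inverse)` explicitly keeps the estimator generic; a new modification only requires a function and the inverse of its expectation map.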

In a recent paper, Qi (2010), one more estimator is introduced, which can be considered as connecting the ideas of the DPR and Hill estimators. At first the procedure is the same as for the DPR estimator: the sample is divided into $n$ groups with $m$ elements in each group. But then, instead of taking the two largest elements in each group, Qi takes the Hill estimator in each group, namely, taking the $s + 1$ ($1 \le s \le m - 1$) largest values in each group and then averaging over groups, he obtains the following estimator of the parameter $\gamma = \alpha^{-1}$:

$$\hat\gamma(s) = \frac{1}{ns}\sum_{i=1}^{n}\sum_{j=1}^{s}\big(\log M^{(j)}_{ni} - \log M^{(s+1)}_{ni}\big), \qquad (13)$$

where $M^{(1)}_{ni} \ge M^{(2)}_{ni} \ge \dots$ are the order statistics of $V_i$. With $s = 1$, estimator (13) takes the form (12) with $f_\ell(x) = -\log x$. It is not difficult to calculate that for this function we get $h_\ell(\alpha) = \alpha^{-1}$.
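A block-wise Hill average of the form (13) might be coded as below. This is a hedged sketch rather than Qi's own procedure: the name `qi_estimator` is hypothetical, the normalization $1/(ns)$ is the one that reduces to $-\log\kappa_{ni}$ averaged over groups when $s = 1$, and the data are assumed strictly positive.

```python
import math
import random

def qi_estimator(sample, m, s):
    """Estimator of gamma = 1/alpha in the spirit of (13): a Hill-type average
    over the s + 1 largest values of each block of length m."""
    if not 1 <= s <= m - 1:
        raise ValueError("s must satisfy 1 <= s <= m - 1")
    n = len(sample) // m
    if n == 0:
        raise ValueError("sample is shorter than one group")
    total = 0.0
    for i in range(n):
        top = sorted(sample[i * m:(i + 1) * m], reverse=True)[:s + 1]
        log_ref = math.log(top[s])                    # log M_i^{(s+1)}
        total += sum(math.log(top[j]) - log_ref for j in range(s))
    return total / (n * s)

if __name__ == "__main__":
    rng = random.Random(0)
    # Standard Pareto sample with alpha = 2, so gamma = 0.5.
    data = [(1.0 - rng.random()) ** (-0.5) for _ in range(20000)]
    print(qi_estimator(data, m=10, s=3))
```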

Having the possibility to choose among several functions in the construction of GDPR estimators, a natural question is which properties of these functions ensure better results in estimating $\alpha$. Comparing the two functions $f_1(x) = x$ and $f_\ell(x) = -\log x$, we see that the corresponding functions $h_1(\alpha) = \alpha/(1 + \alpha)$ and $h_\ell(\alpha) = \alpha^{-1}$ have quite different ranges: the first one has a small range, the interval $(0, 1)$, while the range of the second one is the whole half-line $(0, \infty)$. Moreover, this fact results in different behavior of the derivatives of the inverse functions:

$$\frac{d}{dp} h_1^{-1}(p) = \frac{d}{dp}\Big(\frac{p}{1 - p}\Big) = \frac{1}{(1 - p)^2} = (\alpha + 1)^2,$$

$$\frac{d}{d\gamma} h_\ell^{-1}(\gamma) = \frac{d}{d\gamma}\Big(\frac{1}{\gamma}\Big) = -\frac{1}{\gamma^2} = -\alpha^2.$$

For small values of $\alpha$ (this corresponds to small values of $p$ and large values of $\gamma$) the derivative of the first function is almost one, while for the second function it tends to zero as $\gamma^{-2}$. This means that even big changes in the value of the estimated quantity $\gamma$ result only in small changes of the estimated value of $\alpha$. For large values of $\alpha$ (as $p \to 1$ or $\gamma \to 0$) the behavior of both derivatives is the same, but, evidently, large values of $\alpha$ are not so interesting in the problem of tail index estimation. These considerations explain why Qi's estimator (13) with $s = 1$ (in other words, the GDPR estimator with the function $f_\ell$) performs better than the DPR estimator (with the function $f_1$) and also suggest one more modification of DPR estimators. Namely, if we take the same function $f_r$, but with a negative parameter (to stress this we shall write $f_{-r}$, $r > 0$), there appears the restriction $\alpha > r$, but now the range of the function $h_{-r}(\alpha) = \alpha(\alpha - r)^{-1}$ is the infinite interval $(1, \infty)$ and the behavior of the derivative of the inverse function is very similar to that of $h_\ell^{-1}$. We are able to show that in the case of the function $f_{-r}$ it is possible to find an optimal choice of $r$, and the GDPR estimator with this optimal $r$ is comparable with estimator (13) with $s = 1$ in the sense that for some values of $\alpha$, $\beta$ one estimator has the smaller asymptotic MSE, while for other values the other one dominates. The investigation of all these modifications was carried out while the first version of the paper was being refereed, and the results with proofs are collected in a forthcoming paper, Paulauskas and Vaiciulis (2010).

One more remark concerning Theorem 2 is appropriate here. In Qi (2010) it is mentioned that, using the same method as in the proof of asymptotic normality for estimator (13) (that is, using the relation between order statistics and exponential distributions), it is possible to prove (11). Our proof of (11) differs essentially, since it does not use exponential distributions, and the main tool in the proof is formula (14). It is worth mentioning that one can get the results for estimator (13) with $s = 1$ by using this formula. This is demonstrated in the above-mentioned forthcoming paper.

The paper contains two more sections. In Section 2 we prove Theorem 2, and Section 3 contains results on the comparison of estimators.



2 Proof of Theorem 2

Proof of (8). Relation (8) gives the exact asymptotics of the bias $E\hat p - p$. Generally, finding the exact asymptotics of the bias of a tail index estimator is a rather difficult problem. An advantage of our estimator $\hat p$ is the relative simplicity of the proof of (8). We do not use the asymptotics of the inverse function of $1 - F(x)$ as in Paulauskas (2003), but rather the simple form of the expectation

$$E\hat p = 1 - m\int_0^\infty F^{m-1}(x_2)\int_{x_2}^\infty \frac{dF(x_1)}{x_1}\,dx_2, \qquad (14)$$

which will be proved below. We truncate the outer integral at the level

$$a_m = \kappa\, m^{1/\alpha}(\ln m)^{-1/\alpha}, \qquad (15)$$

where the constant $\kappa$ satisfies $0 < \kappa < (C_1/\zeta)^{1/\alpha}$, and denote

$$K_{m,1} = m\int_0^{a_m} F^{m-1}(x_2)\int_{x_2}^\infty \frac{dF(x_1)}{x_1}\,dx_2, \qquad K_{m,2} = m\int_{a_m}^\infty F^{m-1}(x_2)\int_{x_2}^\infty \frac{dF(x_1)}{x_1}\,dx_2.$$

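Now, (8) follows immediately from the following two relations:

$$K_{m,1} = o(m^{-\zeta}), \qquad (16)$$

$$1 - p - K_{m,2} \sim \chi\, m^{-\zeta}, \quad m \to \infty. \qquad (17)$$

To prove (16), split $K_{m,1}$ into two parts: $K_{m,1} = K'_{m,1} + K''_{m,1}$, where

$$K'_{m,1} = m\int_0^{a_m} F^{m-1}(x_2)\int_{x_2}^{a_m} \frac{dF(x_1)}{x_1}\,dx_2, \qquad K''_{m,1} = m\int_0^{a_m} F^{m-1}(x_2)\,dx_2 \int_{a_m}^\infty \frac{dF(x_1)}{x_1}.$$

Changing the order of integration, we get

$$K'_{m,1} = m\int_0^{a_m} \frac{1}{x_1}\int_0^{x_1} F^{m-1}(x_2)\,dx_2\,dF(x_1) \le m\int_0^{a_m} F^{m-1}(x_1)\,dF(x_1) = F^m(a_m).$$

Assumption (3) and the simple inequality $\ln(1 - x) \le -x$, $0 \le x < 1$, yield

$$F^m(a_m) \le C\big(1 - C_1 a_m^{-\alpha}\big)^m \le C e^{m\ln(1 - C_1 a_m^{-\alpha})} \le C m^{-C_1\kappa^{-\alpha}}$$

for sufficiently large $m$; hence, taking into account (15), we get $K'_{m,1} = o(m^{-\zeta})$. The relations $\int_0^{a_m} F^{m-1}(x)\,dx = O\big(a_m F^{m-1}(a_m)\big)$ and $\int_{a_m}^\infty x^{-1}\,dF(x) = O\big(a_m^{-\alpha-1}\big)$ prove

$$K''_{m,1} = O\big(m a_m^{-\alpha} F^{m-1}(a_m)\big) = o(m^{-\zeta}),$$

and we have (16).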

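Now let us prove (17). Integrating the inner integral by parts, we get

$$K_{m,2} = m\int_{a_m}^\infty F^{m-1}(x)\Big(\frac{C_1\alpha}{\alpha + 1}x^{-\alpha-1} + \frac{C_2\beta}{\beta + 1}x^{-\beta-1} + o(x^{-\beta-1})\Big)dx.$$

Denote $\tilde F(x) = 1 - C_1 x^{-\alpha} - C_2 x^{-\beta}$; then one can write

$$K_{m,2} = K'_{m,2} + K''_{m,2} + R_m,$$

where

$$K'_{m,2} = \frac{m}{\alpha + 1}\int_{a_m}^\infty \tilde F^{m-1}(x)\,d\tilde F(x),$$

$$K''_{m,2} = C_2\beta m\Big(\frac{1}{\beta + 1} - \frac{1}{\alpha + 1}\Big)\int_{a_m}^\infty x^{-\beta-1}\tilde F^{m-1}(x)\,dx, \qquad (18)$$

and

$$R_m = R_{m,1} + R_{m,2} \qquad (19)$$

with

$$R_{m,1} = m\int_{a_m}^\infty \big(F^{m-1}(x) - \tilde F^{m-1}(x)\big)\Big(\frac{C_1\alpha}{\alpha + 1}x^{-\alpha-1} + \frac{C_2\beta}{\beta + 1}x^{-\beta-1}\Big)dx,$$

$$R_{m,2} = m\int_{a_m}^\infty F^{m-1}(x)\,o(x^{-\beta-1})\,dx.$$

Integration gives $1 - p - K'_{m,2} = (\alpha + 1)^{-1}\tilde F^m(a_m) = o(m^{-\zeta})$. We claim that

$$K''_{m,2} \sim -\chi\, m^{-\zeta}, \quad (m \to \infty). \qquad (20)$$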

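Simple calculations show that (20) can be derived from the following two relations:

$$\int_{a_m}^\infty \exp\{-C_1(m - 1)x^{-\alpha}\}\,x^{-\beta-1}\,dx \sim \frac{\Gamma(\beta/\alpha)}{\alpha C_1^{\beta/\alpha}}\,m^{-\beta/\alpha}, \qquad (21)$$

$$\int_{a_m}^\infty \Big(\tilde F^{m-1}(x) - \exp\{-C_1(m - 1)x^{-\alpha}\}\Big)x^{-\beta-1}\,dx = o\big(m^{-\beta/\alpha}\big). \qquad (22)$$

Making the change of variables $y = C_1(m - 1)x^{-\alpha}$, one has

$$\int_{a_m}^\infty \exp\{-C_1(m - 1)x^{-\alpha}\}\,x^{-\beta-1}\,dx = \frac{1}{\alpha\big(C_1(m - 1)\big)^{\beta/\alpha}}\Big(\Gamma(\beta/\alpha) - \Gamma\big(\beta/\alpha,\, C_1(m - 1)a_m^{-\alpha}\big)\Big),$$

where $\Gamma(\cdot)$ is the standard gamma function and

$$\Gamma(a, x) = \int_x^\infty t^{a-1}e^{-t}\,dt$$

is the upper incomplete gamma function. Keeping in mind that $a_m^{-\alpha}m \to \infty$, we have

$$\Gamma\big(\beta/\alpha,\, C_1(m - 1)a_m^{-\alpha}\big) \le \frac{1}{C_1(m - 1)a_m^{-\alpha}}\,\Gamma\Big(1 + \frac{\beta}{\alpha}\Big) \to 0, \quad (m \to \infty).$$

This ends the proof of (21).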

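To prove (22), consider the difference $\Delta_m(x) := \tilde F^m(x) - \exp\{-C_1 m x^{-\alpha}\}$. We assume that $m$ is large enough that the inequalities

$$0 < C_1 a_m^{-\alpha} + |C_2|\,a_m^{-\beta} < 1/2, \qquad 1 + \frac{C_2\beta}{C_1\alpha}\,a_m^{\alpha-\beta} > 0 \qquad (23)$$

are satisfied. We recall that only $C_1$ is supposed positive, while $C_2$ may be negative. The second inequality in (23) ensures the monotone decay of the function $1 - \tilde F(x)$, $x \ge a_m$, and implies $0 < 1 - \tilde F(x) < 1/2$ for $x \ge a_m$. Then we can write

$$\ln \tilde F(x) = -C_1 x^{-\alpha} - C_2 x^{-\beta} + r(x),$$

where $|r(x)| \le 2(C_1 + |C_2|)^2 x^{-2\alpha}$. Using this relation, for sufficiently large $m$ we have

$$|\Delta_m(x)| = \exp\{-C_1 m x^{-\alpha}\}\,\big|\exp\{C_1 m x^{-\alpha} + m\ln\tilde F(x)\} - 1\big|$$
$$= \exp\{-C_1 m x^{-\alpha}\}\,\big|\exp\{-m(C_2 x^{-\beta} - r(x))\} - 1\big|$$
$$\le C m x^{-(2\alpha\wedge\beta)}\exp\{-C_1 m x^{-\alpha}\}.$$

Consequently, the left-hand side of (22) does not exceed

$$Cm\int_{a_m}^\infty \exp\{-C_1(m - 1)x^{-\alpha}\}\,x^{-(2\alpha\wedge\beta)-\beta-1}\,dx = O\big(m^{1-(2\alpha\wedge\beta)/\alpha-\beta/\alpha}\big). \qquad (24)$$

If $2\alpha > \beta$, then the right-hand side of (24) is $O(m^{1-2\beta/\alpha}) = o(m^{-\beta/\alpha})$, while in the case $2\alpha \le \beta$ we have $O\big(m^{1-(2\alpha\wedge\beta)/\alpha-\beta/\alpha}\big) = O\big(m^{-\beta/\alpha-1}\big)$. Thus (22) and, consequently, (20) are proved.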
To finish the proof of (17), it remains to prove that the remainder term $R_m$ from (19) is negligible, that is, $R_{m,i} = o(m^{-\zeta})$, $i = 1, 2$. We have

$$|R_{m,1}| \le Cm\int_{a_m}^\infty \big|F^{m-1}(x) - \tilde F^{m-1}(x)\big|\,x^{-\alpha-1}\,dx.$$

Denote the remainder term in (3) by $h(x)$, that is, $h(x) = 1 - F(x) - C_1 x^{-\alpha} - C_2 x^{-\beta} = x^{-\beta}h_1(x)$, where $h_1(x) = o(1)$ as $x \to \infty$. Let us rewrite the difference in the integrand as follows:

$$\big|F^{m-1}(x) - \tilde F^{m-1}(x)\big| = \tilde F^{m-1}(x)\,\Big|\exp\Big\{(m - 1)\ln\Big(1 - \frac{h(x)}{\tilde F(x)}\Big)\Big\} - 1\Big|.$$

One can assume that for $x \ge a_m$ the inequality $|h(x)|/\tilde F(x) < 1/2$ is satisfied; thus from the Taylor expansion of $\ln\big(1 - h(x)/\tilde F(x)\big)$ it follows that there exists a constant $C > 0$ such that $\big|\ln\big(1 - h(x)/\tilde F(x)\big)\big| \le C|h(x)|/\tilde F(x)$. Since for $x \ge a_m$ the product $(m - 1)x^{-\beta}$ tends to zero as $m \to \infty$, we can assume that the inequality

$$(m - 1)\,\Big|\ln\Big(1 - \frac{h(x)}{\tilde F(x)}\Big)\Big| < 1$$

holds. Then, applying the inequality $|e^x - 1| \le C|x|$, $0 \le |x| \le 1$, we get

$$\big|F^{m-1}(x) - \tilde F^{m-1}(x)\big| \le C(m - 1)\tilde F^{m-1}(x)\frac{|h(x)|}{\tilde F(x)} \le C(m - 1)\tilde F^{m-1}(x)\,|h(x)|.$$

Applying the last inequality, we have

$$|R_{m,1}| \le Cm^2\int_{a_m}^\infty \tilde F^{m-1}(x)\,|h(x)|\,x^{-\alpha-1}\,dx \le Cm^2 a_m^{-\alpha}\sup_{x\ge a_m}|h_1(x)|\int_{a_m}^\infty x^{-\beta-1}\tilde F^{m-1}(x)\,dx.$$

Taking into account (21)-(22), we obtain $|R_{m,1}| \le C a_m^{-\alpha} m^{2-\beta/\alpha}\sup_{x\ge a_m}|h_1(x)| = o(m^{-\zeta})$. In a similar way we get

$$|R_{m,2}| \le Cm^{-\zeta}\sup_{x\ge a_m}|h_1(x)| = o(m^{-\zeta}),$$

and the proof of (17) is completed.

To complete the proof of (8) it remains to prove (14). The random variables $\kappa_{n1}, \dots, \kappa_{nn}$, defined in (2), are i.i.d. Therefore $E\hat p = E\kappa_{n1}$. Now it is clear that

$$E\hat p = m!\int_0^\infty \frac{dF(x_1)}{x_1}\int_0^{x_1} x_2\,dF(x_2)\int_0^{x_2} dF(x_3)\cdots\int_0^{x_{m-1}} dF(x_m)$$
$$= (m - 1)m\int_0^\infty\Big\{\int_0^{x_1} x_2 F^{m-2}(x_2)\,dF(x_2)\Big\}\frac{dF(x_1)}{x_1}.$$

Integrating the inner integral by parts, we get

$$E\hat p = m\int_0^\infty\Big\{x_1 F^{m-1}(x_1) - \int_0^{x_1} F^{m-1}(x_2)\,dx_2\Big\}\frac{dF(x_1)}{x_1} = 1 - m\int_0^\infty\Big\{\int_0^{x_1} F^{m-1}(x_2)\,dx_2\Big\}\frac{dF(x_1)}{x_1}.$$

It remains to change the order of integration to conclude the proof of (14).

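Proof of (11). Since the proofs of (11) and (7) are essentially the same, we give the main steps only. In view of the decomposition

$$\sqrt{n}\,(\hat p - p) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\kappa_{ni} - E\kappa_{ni}) + \sqrt{n}\,\gamma_m,$$

the assertion (11) follows from

$$\frac{1}{\sqrt{n}}\sum_{i=1}^{n}(\kappa_{ni} - E\kappa_{ni}) \xrightarrow[N\to\infty]{D} N(0, \sigma^2), \qquad (25)$$

$$\sqrt{n}\,\gamma_m \to \mu, \quad (m = m_{opt},\ N \to \infty). \qquad (26)$$

To prove (25), one checks the Lyapunov condition for i.i.d. random variables forming a triangular array, while relation (26) can be verified by using (8) and (9) with $m = m_{opt}$.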

Proof of (10). From (11) we know that the variance $E(\hat p - E\hat p)^2$ is asymptotically equal to $\sigma^2 m/N$. Taking the main term in the asymptotic relation (8), we get that the asymptotic mean squared error of $\hat p$ equals $\chi^2 m^{-2\zeta} + \sigma^2 m/N$. Since this is a very simple function of $m$, it is easy to see that a solution of the minimization problem

$$\inf_{2\le m\le N}\big\{\chi^2 m^{-2\zeta} + \sigma^2 m/N\big\} \qquad (27)$$

is given by (9). Here it is necessary to note that we require $N$ to be sufficiently large, since for values of $\zeta$ close to 0 the solution of the minimization problem (for a given $N$) may be smaller than 2. One can also note that instead of taking the main term from (8) we can take the sequence $\gamma_m$ and apply Lemma 2.8 in Dekkers and de Haan (1993); this gives the same result. Having (9), one can easily get (10). The theorem is proved.
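The minimization in (27) is easy to check numerically. The sketch below is only an illustration under stated assumptions: the closed form used in `m_opt_closed_form` is the stationary point obtained by setting the derivative of the objective in (27) with respect to $m$ to zero, the value of `chi` in the demo is hypothetical, and all function names are introduced here.

```python
def asymptotic_mse(m, chi, sigma2, zeta, n_total):
    """Objective in (27): squared leading bias term plus variance term."""
    return chi ** 2 * m ** (-2.0 * zeta) + sigma2 * m / n_total

def m_opt_closed_form(chi, sigma2, zeta, n_total):
    """Stationary point of the objective, treating m as a real variable."""
    return (2.0 * zeta * chi ** 2 * n_total / sigma2) ** (1.0 / (1.0 + 2.0 * zeta))

def m_opt_grid(chi, sigma2, zeta, n_total):
    """Brute-force integer minimiser over 2 <= m <= N (for checking only)."""
    return min(range(2, n_total + 1),
               key=lambda m: asymptotic_mse(m, chi, sigma2, zeta, n_total))

def sigma2_pareto(alpha):
    """Limit variance from Theorem 1: alpha / ((alpha + 1)^2 (alpha + 2))."""
    return alpha / ((alpha + 1.0) ** 2 * (alpha + 2.0))

if __name__ == "__main__":
    alpha, beta = 1.0, 2.0
    zeta = (beta - alpha) / alpha
    chi = 0.5                      # hypothetical value, for illustration only
    n_total = 100000
    s2 = sigma2_pareto(alpha)
    print(m_opt_closed_form(chi, s2, zeta, n_total), m_opt_grid(chi, s2, zeta, n_total))
```

Because the objective is convex in $m$ for $m > 0$, the integer minimiser found by the grid search lies next to the real-valued stationary point.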



3 Comparison of estimators 

In this section we compare the tail index estimator $\hat p$ with the estimators $\hat\gamma^{(j)}_{N,k}$, $j = 1, 2, 3, 4$, using the same method as in de Haan and Peng (1998). It is perhaps worth mentioning that in a recent paper, Nematollahi and Tafakori (2007), another approach to comparing tail index estimators was proposed, but that method of comparison is adapted to a specific estimator introduced by Fan in Fan (2004). We recall that $\hat p$ estimates the quantity $p = \alpha/(\alpha + 1)$, while the four above-mentioned estimators estimate $\gamma = 1/\alpha$, that is, a different function of the unknown parameter $\alpha$. Therefore the first step in the comparison is to transform the estimators to the same function, and we have chosen to compare the estimator $\hat p$ with the estimators $\hat p^{(j)}_{N,k} = 1/\big(1 + \hat\gamma^{(j)}_{N,k}\big)$. We need the following simple statement.

Lemma 4. Suppose (3) holds. Let $k^{(j)} = k^{(j)}(N)$, $j = 1, 2, 3, 4$, be sequences of integers with

$$k^{(j)}(N) \to \infty \ \text{ and } \ k^{(j)}(N)/N \to 0, \quad (N \to \infty), \qquad (28)$$

and let the estimators $\hat\gamma^{(j)}_{N,k^{(j)}}$ be asymptotically normal, i.e., there exist constants $b_j \in \mathbb{R}$ and $\sigma_j > 0$ such that

$$\sqrt{k^{(j)}}\,\big(\hat\gamma^{(j)}_{N,k^{(j)}} - \gamma\big) \xrightarrow[N\to\infty]{D} N(b_j, \sigma_j^2). \qquad (29)$$

Then

$$\sqrt{k^{(j)}}\,\big(\hat p^{(j)}_{N,k^{(j)}} - p\big) \xrightarrow[N\to\infty]{D} N\Big(-\frac{b_j}{(1 + \gamma)^2},\ \frac{\sigma_j^2}{(1 + \gamma)^4}\Big). \qquad (30)$$
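Proof. We use the obvious identity

$$\sqrt{k^{(j)}}\,\big(\hat p^{(j)}_{N,k^{(j)}} - p\big) = \frac{\sqrt{k^{(j)}}\,\big(\gamma - \hat\gamma^{(j)}_{N,k^{(j)}}\big)}{(1 + \gamma)\big(1 + \hat\gamma^{(j)}_{N,k^{(j)}}\big)}.$$

Now relation (30) follows from this identity, Theorem 4.4 in Billingsley (1968), relation (29), and the relation $\hat\gamma^{(j)}_{N,k^{(j)}} \xrightarrow{P} \gamma$ $(N \to \infty)$. The lemma is proved.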

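De Haan and Peng proved (see Theorem 2 in de Haan and Peng (1998)) that the asymptotic second moment of $\hat\gamma^{(j)}_{N,k^{(j)}} - \gamma$ is minimal and equals $\big(k^{(j)}\big)^{-1}\sigma_j^2(1 + 2\zeta)/(2\zeta)$ if $k^{(j)}$ satisfies the relation

$$\lim_{N\to\infty} k^{(j)} A^2\big(N/k^{(j)}\big) = \frac{\sigma_j^2}{2\zeta d_j^2}, \quad j = 1, 2, 3, 4, \qquad (31)$$

where

$$d_1 = \frac{1}{1 + \zeta}, \qquad d_2 = \frac{(1 - 2^{-\zeta})(2^{1/\alpha - \zeta} - 1)}{(2^{1/\alpha} - 1)\,\zeta\ln 2}, \qquad d_3 = \frac{1}{1 + \zeta} - \frac{\alpha\zeta}{(1 + \zeta)^2}, \qquad d_4 = \frac{1}{(1 + \zeta)^2},$$

and

$$\sigma_1^2 = \frac{1}{\alpha^2}, \qquad \sigma_2^2 = \frac{1 + 2^{2/\alpha + 1}}{\alpha^2 (2^{1/\alpha} - 1)^2 \ln^2 2}, \qquad \sigma_3^2 = \frac{1 + \alpha^2}{\alpha^2}, \qquad \sigma_4^2 = \frac{2}{\alpha^2}$$

are the limit variances in (29). The function $A(t)$ in (31) has the asymptotic form

$$A(t) \sim -\frac{\zeta C_2}{\alpha C_1^{1+\zeta}}\,t^{-\zeta}, \quad t \to \infty.$$

We recall that the function $A(t)$ was introduced in (4).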
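Denote by $k^{(j)}_{opt}$ a sequence satisfying (31). From (31), we have

$$k^{(1)}_{opt} \sim \left(\frac{(1 + \zeta)^2 C_1^{2+2\zeta}}{2\zeta^3 C_2^2}\right)^{1/(1+2\zeta)} N^{2\zeta/(1+2\zeta)},$$

$$k^{(2)}_{opt} \sim \left(\frac{(1 + 2^{2/\alpha+1})\,C_1^{2+2\zeta}}{2\zeta(1 - 2^{-\zeta})^2(2^{1/\alpha-\zeta} - 1)^2 C_2^2}\right)^{1/(1+2\zeta)} N^{2\zeta/(1+2\zeta)},$$

$$k^{(3)}_{opt} \sim \left(\frac{(1 + \alpha^2)(1 + \zeta)^4 C_1^{2+2\zeta}}{2\zeta^3(1 + \zeta - \alpha\zeta)^2 C_2^2}\right)^{1/(1+2\zeta)} N^{2\zeta/(1+2\zeta)},$$

$$k^{(4)}_{opt} \sim \left(\frac{(1 + \zeta)^4 C_1^{2+2\zeta}}{\zeta^3 C_2^2}\right)^{1/(1+2\zeta)} N^{2\zeta/(1+2\zeta)}.$$

Under the normalization $k^{(j)}_{opt}(N)$ instead of $k^{(j)}(N)$ in (29), the limit random variable has mean

$$b_j = \frac{\sigma_j}{\sqrt{2\zeta}}\,\mathrm{sgn}\Big(d_j \lim_{N\to\infty}\sqrt{k^{(j)}_{opt}(N)}\,A\big(N/k^{(j)}_{opt}(N)\big)\Big).$$

Moreover, Lemma 4 implies

$$E\big(\hat p^{(j)}_{N,k^{(j)}_{opt}} - p\big)^2 \sim \frac{\sigma_j^2(1 + 2\zeta)}{2\zeta(1 + \gamma)^4\,k^{(j)}_{opt}(N)}. \qquad (32)$$

Now it is possible to compare the estimator $\hat p$ with the estimators $\hat p^{(j)}_{N,k}$ as it was done in de Haan and Peng (1998), i.e., by calculating the limit of the ratio of minimal mean squared errors:

$$RMMSE(j) = \lim_{N\to\infty}\frac{E(\hat p - p)^2}{E\big(\hat p^{(j)}_{N,k^{(j)}_{opt}} - p\big)^2}.$$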
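From (10) and (32) we have the following results:

$$RMMSE(1) = \big(\eta(\alpha, \beta)\,\Gamma^2(2 + \zeta)\big)^{1/(1+2\zeta)},$$

$$RMMSE(2) = (2^{1/\alpha} - 1)^2\ln^2(2)\left(\eta(\alpha, \beta)\,\frac{\zeta^2\,\Gamma^2(1 + \zeta)}{(1 - 2^{-\zeta})^2(2^{(2/\alpha)+1} + 1)^{2\zeta}(2^{1/\alpha-\zeta} - 1)^2}\right)^{1/(1+2\zeta)},$$

$$RMMSE(3) = \left(\eta(\alpha, \beta)\,\frac{(1 + \zeta)^2\,\Gamma^2(2 + \zeta)}{(1 + \alpha^2)^{2\zeta}(1 + \zeta - \alpha\zeta)^2}\right)^{1/(1+2\zeta)},$$

$$RMMSE(4) = \left(\eta(\alpha, \beta)\,\frac{(1 + \zeta)^2\,\Gamma^2(2 + \zeta)}{2^{2\zeta}}\right)^{1/(1+2\zeta)},$$

where

$$\eta(\alpha, \beta) = \left(\frac{\beta(\alpha + 1)}{\alpha(\beta + 1)}\right)^2\left(\frac{(\alpha + 1)^2}{\alpha(\alpha + 2)}\right)^{2\zeta}.$$

It is easy to conclude that $RMMSE(1) > 1$ for all $0 < \alpha < \beta$, i.e., the Hill estimator dominates the estimator $\hat p$. Due to the inequality $(\alpha + 1)^6 - 4\alpha^3(\alpha + 2) > 0$ for $\alpha > 0$ (it follows from the binomial formula), the same conclusion is valid for the de Vries estimator $\hat p^{(4)}_{N,k}$.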

The comparison of the estimators $\hat p$, $\hat p^{(2)}_{N,k}$ and $\hat p^{(3)}_{N,k}$ is shown in Figures 1a-1c. Values of $\alpha$ are on the horizontal axis, while the vertical axis shows values of $\beta$. In all three figures the area $\{(\alpha, \beta) : 0 < \beta \le \alpha\}$ (those values of the parameters that are not considered) is left white. In Figure 1a the area $\{(\alpha, \beta) : RMMSE(2) > 1\}$ is in black and $\{(\alpha, \beta) : RMMSE(2) < 1\}$ is in dark grey. Similarly, Figure 1b presents the comparison of the estimators $\hat p$ and $\hat p^{(3)}_{N,k}$: as in Figure 1a, the area $\{(\alpha, \beta) : RMMSE(3) < 1\}$ is in dark grey and the area $\{(\alpha, \beta) : RMMSE(3) > 1\}$ is in light grey. Finally, Figure 1c gives the areas of domination of the estimators $\hat p$ (dark grey), $\hat p^{(2)}_{N,k}$ (black), and $\hat p^{(3)}_{N,k}$ (light grey).

As was mentioned in the Introduction, in de Haan and Peng (1998) the comparison of the estimators $\hat\gamma^{(j)}_{N,k}$, $j = 1, 2, 3, 4$, was performed with respect to the parameters $(\gamma, \rho)$. For the sake of completeness we also include analogues of Figures 1a-1c in the plane $(\rho, \gamma)$. In Figures 2a-2c the horizontal axis shows $\gamma$ values, while the vertical axis shows $\rho$ values. As in Figures 1a-1c, the area where the estimator $\hat p$ has an asymptotic mean squared error smaller than the other estimator(s) is in dark grey. Black and light grey mark the areas of domination of the estimators $\hat p^{(2)}_{N,k}$ or $\hat p^{(3)}_{N,k}$, respectively.

Figure 1: Figures 1a, 1b and 1c

Figure 2: Figures 2a, 2b and 2c